Improving propensity score weighting using machine learning.

Authors

  • Brian K Lee
  • Justin Lessler
  • Elizabeth A Stuart
Abstract

Machine learning techniques such as classification and regression trees (CART) have been suggested as promising alternatives to logistic regression for the estimation of propensity scores. The authors examined the performance of various CART-based propensity score models using simulated data. Hypothetical studies of varying sample sizes (n=500, 1000, 2000) with a binary exposure, continuous outcome, and 10 covariates were simulated under seven scenarios differing by degree of non-linear and non-additive associations between covariates and the exposure. Propensity score weights were estimated using logistic regression (all main effects), CART, pruned CART, and the ensemble methods of bagged CART, random forests, and boosted CART. Performance metrics included covariate balance, standard error, per cent absolute bias, and 95 per cent confidence interval (CI) coverage. All methods displayed generally acceptable performance under conditions of either non-linearity or non-additivity alone. However, under conditions of both moderate non-additivity and moderate non-linearity, logistic regression had subpar performance, whereas ensemble methods provided substantially better bias reduction and more consistent 95 per cent CI coverage. The results suggest that ensemble methods, especially boosted CART, may be useful for propensity score weighting.
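The weighting step and the per cent absolute bias metric named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation: it uses a single covariate instead of 10, takes the true propensity score as known rather than estimating it with logistic regression or CART, and assumes inverse-probability (ATE-style) weights, which the abstract does not specify. All function names and the numeric settings (true effect 2.0, confounder coefficient 3.0) are hypothetical.

```python
import math
import random


def iptw_weights(treated, propensity):
    """Inverse-probability weights (ATE form; an assumption, not from the abstract)."""
    return [1.0 / e if t == 1 else 1.0 / (1.0 - e)
            for t, e in zip(treated, propensity)]


def weighted_effect(outcome, treated, weights):
    """Difference in weighted outcome means: exposed minus unexposed."""
    def arm_mean(arm):
        num = sum(y * w for y, t, w in zip(outcome, treated, weights) if t == arm)
        den = sum(w for t, w in zip(treated, weights) if t == arm)
        return num / den
    return arm_mean(1) - arm_mean(0)


def percent_abs_bias(estimate, truth):
    """Per cent absolute bias of an estimate relative to the true effect."""
    return 100.0 * abs(estimate - truth) / abs(truth)


def simulate(n=5000, true_effect=2.0, seed=0):
    """Toy data: one confounder x drives both exposure and outcome (illustrative values)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # True propensity score: logistic in x. A real analysis would estimate this.
    e = [1.0 / (1.0 + math.exp(-xi)) for xi in x]
    t = [1 if rng.random() < ei else 0 for ei in e]
    y = [true_effect * ti + 3.0 * xi + rng.gauss(0.0, 1.0)
         for ti, xi in zip(t, x)]
    return x, e, t, y


if __name__ == "__main__":
    x, e, t, y = simulate()
    naive = weighted_effect(y, t, [1.0] * len(y))        # unweighted comparison
    weighted = weighted_effect(y, t, iptw_weights(t, e))  # propensity-weighted
    print("naive bias (%):   ", round(percent_abs_bias(naive, 2.0), 1))
    print("weighted bias (%):", round(percent_abs_bias(weighted, 2.0), 1))
```

In a comparison like the one the abstract describes, the `e` passed to `iptw_weights` would instead come from each fitted model (logistic regression, CART, or an ensemble), and the bias would be averaged over many simulated data sets.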

Related resources

Improving propensity score estimators' robustness to model misspecification using super learner.

The consistency of propensity score (PS) estimators relies on correct specification of the PS model. The PS is frequently estimated using main-effects logistic regression. However, the underlying model assumptions may not hold. Machine learning methods provide an alternative nonparametric approach to PS estimation. In this simulation study, we evaluated the benefit of using Super Learner (SL) f...

An Application of Non-response Bias Reduction Using Propensity Score Methods

In many statistical studies, some units do not respond to some or all of the questions. This situation causes a problem called non-response. Bias and variance inflation are two important consequences of non-response in surveys. Although increasing the sample size can prevent variance inflation, it cannot necessarily adjust for the non-response bias. Therefore a number of methods ...

Application of Machine Learning Classifiers and Regularization in Econometric Theory

As machine learning techniques become more popular and computers become capable of storing and processing large quantities of data, there have been many recent efforts to incorporate such techniques into structural econometric models. My research aims to extend this literature by introducing the techniques of regularization and classification (from machine learning) into Generalized Method of M...

Modeling Approaches for Cost and Cost-Effectiveness Estimation Using Observational Data

The estimation of treatment effects on medical costs and cost effectiveness measures is complicated by the need to account for non-independent censoring, skewness and the effects of confounders. In this dissertation, we develop several cost and cost-effectiveness tools that account for these issues. Since medical costs are often collected from observational claims data, we investigate propensit...

Comparing Weighting Methods in Propensity Score Analysis

The propensity score method is frequently used to deal with bias from standard regression in observational studies. The propensity score method involves calculating the conditional probability (propensity) of being in the treated group (of the exposure) given a set of covariates, weighting (or sampling) the data based on these propensity scores, and then analyzing the outcome using the weighted...

Journal title:
  • Statistics in medicine

Volume 29, Issue 3

Pages -

Publication date: 2010